Correcting Spelling Errors by Modelling Their Causes
Abstract
This paper presents a new technique for correcting isolated words in typed texts. A language-dependent set of string substitutions reflects the surface form of errors that result from vocabulary incompetence, misspellings, or mistypings. Candidate corrections are formed by applying the substitutions to text words absent from the computer lexicon. A minimal acyclic deterministic finite automaton storing the lexicon allows quick rejection of nonsense corrections, while costs associated with the substitutions serve to rank the remaining ones. A comparison of the correction lists generated by several spellcheckers for two corpora of English spelling errors shows that our technique suggests the right words more accurately than the others.
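The pipeline described in the abstract (apply substitutions, reject candidates missing from the lexicon, rank the rest by cost) can be illustrated with a minimal sketch. It is not the authors' implementation: the substitution rules, their costs, and the tiny lexicon are invented for illustration, a plain Python set stands in for the minimal acyclic deterministic finite automaton, and only one substitution is applied per candidate.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical rules: (erroneous substring, intended substring, cost).
# The paper's actual substitution set and costs are not given in the abstract.
SUBSTITUTIONS: List[Tuple[str, str, float]] = [
    ("ie", "ei", 1.0),
    ("w", "wh", 1.0),
    ("f", "ph", 2.0),
    ("e", "a", 3.0),
    ("w", "r", 4.0),
]

# Stand-in lexicon. A set gives fast membership tests; the paper stores the
# lexicon in a minimal acyclic DFA instead.
LEXICON: Set[str] = {"receive", "which", "rich", "phone", "separate"}


def candidates(
    word: str,
    lexicon: Set[str],
    substitutions: List[Tuple[str, str, float]],
) -> List[Tuple[str, float]]:
    """Return lexicon words reachable by one substitution, cheapest first."""
    if word in lexicon:
        return []  # only words absent from the lexicon are corrected

    best: Dict[str, float] = {}
    for wrong, right, cost in substitutions:
        start = word.find(wrong)
        while start != -1:
            candidate = word[:start] + right + word[start + len(wrong):]
            # Reject nonsense corrections; keep the lowest cost per candidate.
            if candidate in lexicon and cost < best.get(candidate, float("inf")):
                best[candidate] = cost
            start = word.find(wrong, start + 1)

    # Rank by cost, breaking ties alphabetically for a stable order.
    return sorted(best.items(), key=lambda item: (item[1], item[0]))


if __name__ == "__main__":
    for typo in ("recieve", "wich", "fone", "seperate"):
        print(typo, candidates(typo, LEXICON, SUBSTITUTIONS))
```

With these assumed rules, "wich" yields both "which" and "rich", and the lower cost places "which" first. Handling several substitutions per word and using an automaton-backed lexicon, as the abstract describes, would need a more elaborate search than this sketch shows.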
Similar resources
Design and implementation of Persian spelling detection and correction system based on Semantic
Persian Language has a special feature (grapheme, homophone, and multi-shape clinging characters) in electronic devices. Furthermore, design and implementation of NLP tools for Persian are more challenging than other languages (e.g. English or German). Spelling tools are used widely for editing user texts like emails and text in editors. Also developing Persian tools will provide Persian progr...
A Multi-Agent System for Detecting and Correcting "Hidden" Spelling Errors in Arabic Texts
In this paper, we address the problem of detecting and correcting hidden spelling errors in Arabic texts. Hidden spelling errors are morphologically valid words and therefore they cannot be detected or corrected by conventional spell checking programs. In the work presented here, we investigate this kind of errors as they relate to the Arabic language. We start by proposing a classification o...
Correcting real-word spelling errors by restoring lexical cohesion
Spelling errors that happen to result in a real word in the lexicon cannot be detected by a conventional spelling checker. We present a method for detecting and correcting many such errors by identifying tokens that are semantically unrelated to their context and are spelling variations of words that would be related to the context. Relatedness to context is determined by a measure of semantic ...
Three-Phase Text Error Correction Model for Korean SMS Messages
In this paper, we propose a three-phase text error correction model consisting of a word spacing error correction phase, a syllable-based spelling error correction phase, and a word-based spelling error correction phase. In order to reduce the text error correction complexity, the proposed model corrects text errors step by step. With the aim of correcting word spacing errors, spelling errors, a...
Action Research: Solving Sixth Grade Students Writing Disorder Based on the Results of Nepsy Neuropsychology Test
Introduction: Learning to write is very important because writing is a powerful tool for thinking, learning, and the need to continue studying to enter a variety of professions. Meanwhile, there are students who have difficulty learning to write for a variety of reasons. Therefore, the present study was conducted with the aim of eliminating the dictation disorder of sixth grade elementary stude...